There are several packages and options available to construct and analyze ecological networks. Here, we will use the bipartite package (Dormann et al. 2008) to plot a basic site by species network and then apply Beckett’s (2016) DIRTLPAb+ algorithm to perform community detection by bipartite modularity analysis. Once the communities are found, we will work through some visualization options. For this demonstration, we will use phytoplankton community data from Alberta, Canada.

Objectives

1. Background

(from Loewen et al. 2021a. Bioregions are predominately climatic for fishes of northern lakes. Global Ecology and Biogeography)

Modularity analysis provides unsupervised learning of community structure based solely on network topology, where divisions are made to maximize the number of edges within, rather than between, resultant modules (Newman & Girvan 2004). While there are several methods available for clustering data, most distance-based procedures flatten bipartite networks, reducing species information to some measure of site dissimilarity. Modularity analysis can be used to maintain the bipartite nature of site-species networks, identifying both groups of sites that share distinct biotas and sets of species that tend to co-occur.

While modularity analysis was initially developed for the unipartite case (Newman & Girvan 2004), a bipartite formulation was proposed by Barber (2007) to allow links only between nodes of opposing types. Barber’s definition classifies both sets of nodes simultaneously and produces one-to-one node correspondence (i.e. it will produce the same number of modules for sites and species), associating assemblages directly to their physical locations.

Modularity optimization is computationally challenging (NP-hard; Miyauchi & Sukegawa 2015), necessitating heuristics to search through solution space of larger networks. The DIRTLPAb+ algorithm of Beckett (2016) is a label propagation approach to maximize Barber’s bipartite modularity. Briefly, the label propagation algorithm (LPA) for community detection was proposed by Raghavan et al. (2007), where nodes were initialized with all unique labels, updated iteratively to adopt the most common labels of their neighbours (with ties broken uniformly), and grouped together once no further improvements could be made. The routine was subsequently modified to the bipartite case (LPAb); however, it had a tendency to become trapped in local maxima. To escape such traps, Liu & Murata (2010) incorporated a multistep greedy agglomerative algorithm (LPAb+). Their approach applied asynchronous label propagation (updating node labels of each type in turn) but added a step where resulting groups could be merged if doing so improved the solution, iterating between propagation and aggregation phases. Finally, because label propagation is inherently stochastic, and thus outcomes can vary depending on initialization, Beckett (2016) proposed that LPAb+ should be repeated under different node configurations (i.e. numbers of unique starting labels) and report the greatest ensuing score (DIRTLPAb+).

2. Building the network

For this tutorial, we will use R software (https://www.r-project.org/). We start by opening R, setting the working directory, and loading the bipartite and dplyr packages.

# Load packages
library(bipartite)
library(dplyr)

Next we read our ecological community data matrix containing species (columns) and sites (rows). The matrix can be binary (0s and 1s) or weighted (e.g. by abundance or biomass). For this demonstration, we will use data on phytoplankton communities in lakes and reservoirs across Alberta, Canada (see Loewen et al. 2020 for details). These data are presented as biomass for individual taxa at lakes and reservoirs on different sampling dates, and are available for download from the Dryad Digital Repository (https://doi.org/10.5061/dryad.gf1vhhmk7).

Download the data into your working directory and load it into the R environment. As data from multiple sampling events are provided fore each site, we will subset the matrix to work with records obtained in August only and remove extra fields.

# Read the species data
phyto.comm <- read.csv("Loewen_et_al._2020._Ecol_App._Species_data_-_final.csv", header = T)

# Subset data to select August sampling events
phyto.comm.aug.bio <- subset(phyto.comm, Month == "Aug")

# Set Lake.Name as row names
rownames(phyto.comm.aug.bio) <- phyto.comm.aug.bio$Lake.Name

# Remove extra fields
phyto.comm.aug.bio <- phyto.comm.aug.bio[, 5:308]

# Remove any species not found in August
i <- (colSums(phyto.comm.aug.bio) != 0)
phyto.comm.aug.bio <- phyto.comm.aug.bio[, i]

# Create binary community matrix (presence/absence) from weighted matrix
phyto.comm.aug.bin <- phyto.comm.aug.bio
phyto.comm.aug.bin[phyto.comm.aug.bin > 0] <- 1

Now that we have our data wrangled, we can visualize them as bipartite networks. Lets plot the unweighted bipartite graph first.

# Default bipartite graph plot
plotweb(phyto.comm.aug.bin, text.rot = 90)

Now plot a bipartite graph weighted by biomass.

# Default bipartite graph plot
plotweb(phyto.comm.aug.bio, text.rot = 90)

While there are too many species to make out the details in these plots, nodes at the top represent individual taxa while sites are represented along the bottom. Note that in the second (weighted) plot, thicker links between nodes indicate relatively greater biomass for a species at a particular site, whereas all links are the same thickness in the first (unweighted, presence/absence) plot. Node thickness also varies, indicating relative biomass in the weighted graph and relative occurrence in the unweighted graph.

3. Analyzing the network

While different approaches may reveal different patterns, we will use the computeModules function in the bipartite package to find groups of densely linked sites and species (i.e. modules) by maximizing Barber’s index. First, we will analyze the unweighted network.

# Set random seed for reproducibility of random process
set.seed(99)

# Conduct bipartite modularity analysis on binary network using Beckett's algorithm
bin.DIRT <- computeModules(phyto.comm.aug.bin, method = "Beckett")

# Determine number of modules
nrow(bin.DIRT@modules) - 1
## [1] 6
# Calculate participation coefficient
bin.DIRT.cz.higher <- czvalues(bin.DIRT, level = "higher")
bin.DIRT.cz.lower <- czvalues(bin.DIRT, level = "lower")

# Check modularity (Q) value of proposed module structure
bin.DIRT@likelihood
## [1] 0.2477575

The algorithm detected six modules with a modularity value of 0.2477575, but are these modules truly non-random?

For this, we can compare the computed value to those obtained from a series of random networks. The issue of obtaining random networks is also non-trivial, and several approaches exist. Here, we will use the sequentially swapping curveball algorithm (Strona et al. 2014), which has been shown to efficiently produce uniformly distributed null matrices that maintain row (site) and column (species) totals for presence/absence data. Because the approach is sequential (i.e. further randomized with each step), it requires a large number of iterations as well as a burn-in period. We will use the oecosimu function in the vegan package (Oksanen et al. 2020) to run the simulations in parallel. We will use 5,000 burn-in and thinning steps, which discard iterations at the beginning and between each sample, respectively.

# Load packages
library(parallel)
library(vegan)

# Create a function that captures the modularity values of module structures
func <- function(x) computeModules(x)@likelihood

# Set random seed for reproducibility
set.seed(99)

# Detect and set the number of available cores to run simulations in parallel
mc.cores <- parallel::detectCores()
clus <- makeCluster(mc.cores)
clusterEvalQ(clus, library(bipartite))
# Calculate module structure for 20 null weighted networks, obtained using the curveball algorithm, and conduct one-sided randomization test
bin.test <- oecosimu(comm = phyto.comm.aug.bin, nestfun = func, method = "curveball", nsimul = 20, burnin = 5000, thin = 5000, statistic = "likelihood", alternative = "greater", parallel = clus)

# Stop the cluster
stopCluster(clus)

# Extract P-value
bin.test[["oecosimu"]][["pval"]]
## [1] 0.04761905
# Plot observed and simulated modularity values
plot(density(bin.test[["oecosimu"]][["simulated"]]), xlim = c(min(bin.test[["statistic"]], min(bin.test[["oecosimu"]][["simulated"]])), max(bin.test[["statistic"]], max(bin.test[["oecosimu"]][["simulated"]]))), main = "Modularity Null Comparisons", xlab = "Modularity (Q)", ylab = "Density")
abline(v = bin.test[["statistic"]], col = "red", lwd = 2)

In this plot, the red line represents the modularity statistic for the proposed module structure, while the density plot shows values obtained for random networks. As we can see, the proposed module structure is highly non-random, as the observed modularity statistic is greater than any obtained from the random networks (P < 0.05). However, while we only generated 20 null matrices for demonstration purposes, several hundred (or thousand) should be used to obtain a null distribution for formal analyses (this can take a long time with large networks). It is also important to check that null matrices generated by sequential swapping procedures have converged on a random structure. Autocorrelation is avoided by using large numbers of burn-in and thinning steps.

4. Visualizing the proposed module structure

Now, lets visualize the proposed module structure. We’ll start with the site modules. Here, we’ll use ggmap and ggplot2 (supplemented with ggsn to add in a scale bar and north arrow). We’ll extract site modules and merge these results with site coordinates (obtained from the environmental data matrix in Loewen et al. 2020) and plot the results on a terrain basemap.

# Load packages
library(ggmap)
library(ggsn)
library(ggplot2)

# Read the environmental data
phyto.env <- read.csv("Loewen_et_al._2020._Ecol_App._Environmental_data_-_final.csv", header = T)
phyto.env.aug <- subset(phyto.env, Month == "Aug")
rownames(phyto.env.aug) <- phyto.env.aug$Lake.Name

# Extract the site module structure
bin.DIRT.list <- listModuleInformation(bin.DIRT)

bin.DIRT.site.mod.1 <- bin.DIRT.list[[2]][[1]][[1]]
bin.DIRT.site.mod.1.vector <- as.matrix(if_else(rownames(bin.DIRT@moduleWeb) %in% bin.DIRT.site.mod.1, 1, 0))
bin.DIRT.site.mod.2 <- bin.DIRT.list[[2]][[2]][[1]]
bin.DIRT.site.mod.2.vector <- as.matrix(if_else(rownames(bin.DIRT@moduleWeb) %in% bin.DIRT.site.mod.2, 1, 0))
bin.DIRT.site.mod.3 <- bin.DIRT.list[[2]][[3]][[1]]
bin.DIRT.site.mod.3.vector <- as.matrix(if_else(rownames(bin.DIRT@moduleWeb) %in% bin.DIRT.site.mod.3, 1, 0))
bin.DIRT.site.mod.4 <- bin.DIRT.list[[2]][[4]][[1]]
bin.DIRT.site.mod.4.vector <- as.matrix(if_else(rownames(bin.DIRT@moduleWeb) %in% bin.DIRT.site.mod.4, 1, 0))
bin.DIRT.site.mod.5 <- bin.DIRT.list[[2]][[5]][[1]]
bin.DIRT.site.mod.5.vector <- as.matrix(if_else(rownames(bin.DIRT@moduleWeb) %in% bin.DIRT.site.mod.5, 1, 0))
bin.DIRT.site.mod.6 <- bin.DIRT.list[[2]][[6]][[1]]
bin.DIRT.site.mod.6.vector <- as.matrix(if_else(rownames(bin.DIRT@moduleWeb) %in% bin.DIRT.site.mod.6, 1, 0))

# Wrangle site module structure
bin.DIRT.site.mods <- cbind(bin.DIRT.site.mod.1.vector, bin.DIRT.site.mod.2.vector, bin.DIRT.site.mod.3.vector, bin.DIRT.site.mod.4.vector, bin.DIRT.site.mod.5.vector, bin.DIRT.site.mod.6.vector)

rownames(bin.DIRT.site.mods) <- rownames(bin.DIRT@moduleWeb)
bin.DIRT.site.mods <- merge(bin.DIRT.site.mods, bin.DIRT.cz.lower$c, by = "row.names")
rownames(bin.DIRT.site.mods) <- bin.DIRT.site.mods$Row.names
bin.DIRT.site.mods <- bin.DIRT.site.mods[, -1]

colnames(bin.DIRT.site.mods) <- c("Mod01", "Mod02", "Mod03", "Mod04", "Mod05", "Mod06", "c")
coords2var <- cbind(phyto.env.aug$Latitude.N, phyto.env.aug$Longitude.W)
coords2var <- as.data.frame(coords2var, row.names = row.names(phyto.env.aug))
colnames(coords2var) <- c("Latitude", "Longitude")
bin.DIRT.site.mods <- merge(bin.DIRT.site.mods, coords2var, by = "row.names")
rownames(bin.DIRT.site.mods) <- bin.DIRT.site.mods$Row.names
bin.DIRT.site.mods <- bin.DIRT.site.mods[, -1]

bin.DIRT.site.mods['Mods'] <- NA
bin.DIRT.site.mods$Mods[bin.DIRT.site.mods$Mod01 == 1] <- "Mod01"
bin.DIRT.site.mods$Mods[bin.DIRT.site.mods$Mod02 == 1] <- "Mod02"
bin.DIRT.site.mods$Mods[bin.DIRT.site.mods$Mod03 == 1] <- "Mod03"
bin.DIRT.site.mods$Mods[bin.DIRT.site.mods$Mod04 == 1] <- "Mod04"
bin.DIRT.site.mods$Mods[bin.DIRT.site.mods$Mod05 == 1] <- "Mod05"
bin.DIRT.site.mods$Mods[bin.DIRT.site.mods$Mod06 == 1] <- "Mod06"
bin.DIRT.site.mods$Mods <- as.factor(bin.DIRT.site.mods$Mods)
# Plot binary module map

library(GISTools)

basemap <- get_stamenmap(bbox = c(left = -max(bin.DIRT.site.mods$Longitude) - 2, bottom = min(bin.DIRT.site.mods$Latitude) - 2, right = -min(bin.DIRT.site.mods$Longitude) + 2, top = max(bin.DIRT.site.mods$Latitude) + 2), zoom = 5, maptype = "terrain-background")

bin.site.mod.map <- ggmap(basemap, maprange = TRUE) +
  geom_point(data = bin.DIRT.site.mods, aes(-Longitude, Latitude, color = Mods), shape = 16, size = 3) +
  geom_point(data = bin.DIRT.site.mods, aes(-Longitude, Latitude), color = "black", shape = 1, size = 3) +
  scale_x_continuous("Longitude", breaks = c(-120, -115, -110), 
                     labels = c("-120", "-115", "-110"), limits = c(-120, -110)) +
  scale_y_continuous("Latitude", breaks = c(49, 54, 59), 
                     labels = c("49", "54", "59"), limits = c(49, 59.7)) +
  labs(color = "Modules", title = "Unweighted site modules") +
  scalebar(x.min = attr(basemap, "bb")[[2]], 
           y.min = attr(basemap, "bb")[[1]], 
           x.max = attr(basemap, "bb")[[4]], 
           y.max = attr(basemap, "bb")[[3]], 
           dist = 100, anchor = c(x = -119.6, y = 49.6), transform = T, 
           location = "bottomleft", st.size = 3, st.dist = 0.032, dist_unit = "km") +
  theme(panel.border = element_rect(colour = "black", fill = NA),
        plot.title = element_text(hjust = 0.5),
        legend.key = element_rect(fill = "white"))
north2(bin.site.mod.map, x = 0.9, y = 0.9, symbol = 16)

The six modules are distributed across the province, showing no clear spatial pattern, and suggesting a potentially important role of local factors (including water quality and biotic interactions) in driving community differentiation. We can also plot the participation coefficients of each site, which is a measure of among-module connectivity (Guimerà & Amaral 2005). Generally, higher participation coefficients indicate transition zones in spatially structured data.

# Plot binary participation coefficient map
bin.site.pc.map <- ggmap(basemap, maprange = TRUE) +
  geom_point(data = bin.DIRT.site.mods, aes(-Longitude, Latitude, color = c), shape = 16, size = 3) +
  geom_point(data = bin.DIRT.site.mods, aes(-Longitude, Latitude), color = "black", shape = 1, size = 3) +
  scale_x_continuous("Longitude", breaks = c(-120, -115, -110), 
                     labels = c("-120", "-115", "-110"), limits = c(-120, -110)) +
  scale_y_continuous("Latitude", breaks = c(49, 54, 59), 
                     labels = c("49", "54", "59"), limits = c(49, 59.7)) +
  labs(color = "Participation\ncoefficient", title = "Unweighted site modules") +
  scale_colour_gradientn(colours = terrain.colors(10)) +
  scalebar(x.min = attr(basemap, "bb")[[2]],
           y.min = attr(basemap, "bb")[[1]],
           x.max = attr(basemap, "bb")[[4]],
           y.max = attr(basemap, "bb")[[3]],
           dist = 100, anchor = c(x = -119.6, y = 49.6), transform = T, 
           location = "bottomleft", st.size = 3, st.dist = 0.032, dist_unit = "km") +
  theme(panel.border = element_rect(colour = "black", fill = NA),
        plot.title = element_text(hjust = 0.5),
        legend.key = element_rect(fill = "white"))
north2(bin.site.pc.map, x = 0.9, y = 0.9, symbol = 16)

Again, we see no clear spatial patterns, reinforcing the suggestion that local factors structure community differentiation at this scale of analysis.

Now lets plot the species modules. We’ll extract the species modules, wrangle them, and produce a hierarchical edge bundling plot using ggraph and igraph. We’ll also bring in some trait data to highlight groups of phytoplankton with specific attributes. Trait data are available from Supporting Information S2 in Loewen et al. (2021b) available at https://doi.org/10.1002/lno.11694.

# Load packages
library(igraph)
library(ggraph)
library(gridExtra)

# Load phytoplankton trait data
phyto.traits <- read.csv("lno11694-sup-0002-supinfo02.csv", header = T)
phyto.traits$SPECIES.NAME <- gsub(" ", ".", phyto.traits$SPECIES.NAME)
phyto.traits$SPECIES.NAME <- gsub("\\.\\.", ".", phyto.traits$SPECIES.NAME)
rownames(phyto.traits) <- phyto.traits$SPECIES.NAME

# Extract the species module structure
bin.DIRT.sp.mod.1 <- bin.DIRT.list[[2]][[1]][[2]]
bin.DIRT.sp.mod.1.vector <- as.matrix(if_else(colnames(bin.DIRT@moduleWeb) %in% bin.DIRT.sp.mod.1, 1, 0))
bin.DIRT.sp.mod.2 <- bin.DIRT.list[[2]][[2]][[2]]
bin.DIRT.sp.mod.2.vector <- as.matrix(if_else(colnames(bin.DIRT@moduleWeb) %in% bin.DIRT.sp.mod.2, 1, 0))
bin.DIRT.sp.mod.3 <- bin.DIRT.list[[2]][[3]][[2]]
bin.DIRT.sp.mod.3.vector <- as.matrix(if_else(colnames(bin.DIRT@moduleWeb) %in% bin.DIRT.sp.mod.3, 1, 0))
bin.DIRT.sp.mod.4 <- bin.DIRT.list[[2]][[4]][[2]]
bin.DIRT.sp.mod.4.vector <- as.matrix(if_else(colnames(bin.DIRT@moduleWeb) %in% bin.DIRT.sp.mod.4, 1, 0))
bin.DIRT.sp.mod.5 <- bin.DIRT.list[[2]][[5]][[2]]
bin.DIRT.sp.mod.5.vector <- as.matrix(if_else(colnames(bin.DIRT@moduleWeb) %in% bin.DIRT.sp.mod.5, 1, 0))
bin.DIRT.sp.mod.6 <- bin.DIRT.list[[2]][[6]][[2]]
bin.DIRT.sp.mod.6.vector <- as.matrix(if_else(colnames(bin.DIRT@moduleWeb) %in% bin.DIRT.sp.mod.6, 1, 0))

# Wrangle species module structure
bin.DIRT.sp.mods <- cbind(bin.DIRT.sp.mod.1.vector, bin.DIRT.sp.mod.2.vector, bin.DIRT.sp.mod.3.vector, bin.DIRT.sp.mod.4.vector, bin.DIRT.sp.mod.5.vector, bin.DIRT.sp.mod.6.vector)

rownames(bin.DIRT.sp.mods) <- colnames(bin.DIRT@moduleWeb)
bin.DIRT.sp.mods <- merge(bin.DIRT.sp.mods, bin.DIRT.cz.higher$c, by = "row.names")
rownames(bin.DIRT.sp.mods) <- bin.DIRT.sp.mods$Row.names
bin.DIRT.sp.mods <- bin.DIRT.sp.mods[, -1]

colnames(bin.DIRT.sp.mods) <- c("Mod01", "Mod02", "Mod03", "Mod04", "Mod05", "Mod06", "c")

bin.DIRT.sp.mods['Mods'] <- NA
bin.DIRT.sp.mods$Mods[bin.DIRT.sp.mods$Mod01 == 1] <- "Mod01"
bin.DIRT.sp.mods$Mods[bin.DIRT.sp.mods$Mod02 == 1] <- "Mod02"
bin.DIRT.sp.mods$Mods[bin.DIRT.sp.mods$Mod03 == 1] <- "Mod03"
bin.DIRT.sp.mods$Mods[bin.DIRT.sp.mods$Mod04 == 1] <- "Mod04"
bin.DIRT.sp.mods$Mods[bin.DIRT.sp.mods$Mod05 == 1] <- "Mod05"
bin.DIRT.sp.mods$Mods[bin.DIRT.sp.mods$Mod06 == 1] <- "Mod06"
bin.DIRT.sp.mods$Mods <- as.factor(bin.DIRT.sp.mods$Mods)

#  Create adjacency matrix and wrangle edge lists
bin.sp.adjac <- as.one.mode(phyto.comm.aug.bin, fill = 0, project = "higher", weighted = TRUE)

bin.sp.origins <- data.frame(c("origin", "origin", "origin", "origin",  "origin",  "origin"), 
                             c("Mod01", "Mod02", "Mod03", "Mod04", "Mod05", "Mod06"))
names(bin.sp.origins) <- c("from", "to")
bin.sp.mods.edges <- as.data.frame(bin.DIRT.sp.mods$Mods)
colnames(bin.sp.mods.edges) <- c("from")
bin.sp.mods.edges$to <- rownames(bin.DIRT.sp.mods)
bin.sp.mods.edges <- bin.sp.mods.edges[order(bin.sp.mods.edges$from), ]
bin.sp.mods.edges <- rbind(bin.sp.origins, bin.sp.mods.edges)

bin.sp.temp.graph <- graph_from_adjacency_matrix(bin.sp.adjac, mode = "undirected", weighted = TRUE, diag = FALSE)
bin.sp.temp.graph.edge.att <- E(bin.sp.temp.graph)$weight
bin.sp.temp.graph.edge.list <- get.edgelist(bin.sp.temp.graph)
bin.sp.mods.connect <- as.data.frame(cbind(bin.sp.temp.graph.edge.list, bin.sp.temp.graph.edge.att))
colnames(bin.sp.mods.connect) <- c("from", "to", "value")
bin.sp.mods.connect$value <- as.numeric(bin.sp.mods.connect$value)

bin.sp.mods.colsums <- as.data.frame(colSums (phyto.comm.aug.bin))
bin.sp.mods.colsums.match <- as.data.frame(bin.sp.mods.colsums$`colSums(phyto.comm.aug.bin)`[match(bin.sp.mods.edges$to, rownames(bin.sp.mods.colsums))])
bin.sp.mods.colsums.match <- rbind(c(NA), bin.sp.mods.colsums.match)
colnames(bin.sp.mods.colsums.match) <- "occur"

# Create data.frame of vertices
bin.sp.mods.vertices <- data.frame(name = unique(c(as.character(bin.sp.mods.edges$from), as.character(bin.sp.mods.edges$to))), value = bin.sp.mods.colsums.match)
bin.sp.mods.vertices$group <- bin.sp.mods.edges$from[match(bin.sp.mods.vertices$name, bin.sp.mods.edges$to)]

# Calculate angle of labels
bin.sp.mods.vertices$id <- NA
bin.sp.mods.myleaves <- which(is.na(match(bin.sp.mods.vertices$name, bin.sp.mods.edges$from)))
bin.sp.mods.nleaves <- length(bin.sp.mods.myleaves)
bin.sp.mods.vertices$id[bin.sp.mods.myleaves]<-seq(1:bin.sp.mods.nleaves)
bin.sp.mods.vertices$angle <- 90 - 360 * (bin.sp.mods.vertices$id - 22) / bin.sp.mods.nleaves

# Calculate the alignment of labels
bin.sp.mods.vertices$hjust <- ifelse((bin.sp.mods.vertices$angle < 90.01 & bin.sp.mods.vertices$angle > -90), 1, 0)

# Flip label angles
bin.sp.mods.vertices$angle <- ifelse(bin.sp.mods.vertices$angle < -90 | bin.sp.mods.vertices$angle > 90, bin.sp.mods.vertices$angle + 180, bin.sp.mods.vertices$angle)

# Add trait data
bin.sp.mods.vertices <- merge(bin.sp.mods.vertices, phyto.traits["SUSPECTED_TOXIN_PRODUCING"], by.x = "name", by.y = 0, all.x = TRUE)
bin.sp.mods.vertices <- merge(bin.sp.mods.vertices, phyto.traits["SILICA_USING"], by.x = "name", by.y = 0, all.x = TRUE)
bin.sp.mods.vertices <- merge(bin.sp.mods.vertices, phyto.traits["CHL_B"], by.x = "name", by.y = 0, all.x = TRUE)
bin.sp.mods.vertices <- merge(bin.sp.mods.vertices, phyto.traits["CHL_C"], by.x = "name", by.y = 0, all.x = TRUE)

# Create a graph object
bin.sp.mods.mygraph <- graph_from_data_frame(bin.sp.mods.edges, vertices = bin.sp.mods.vertices)

# Create connection objects
bin.sp.mods.from <- match(bin.sp.mods.connect$from, bin.sp.mods.vertices$name)
bin.sp.mods.to <- match(bin.sp.mods.connect$to, bin.sp.mods.vertices$name)
# Plot binary hierarchical edge bundling plot
bin.sp.edge.group.plot <- ggraph(bin.sp.mods.mygraph, layout = 'dendrogram', circular = TRUE) + 
  geom_conn_bundle(data = get_con(from = bin.sp.mods.from, to = bin.sp.mods.to, value = bin.sp.mods.connect$value),
                   edge_alpha = 0.1, width = 0.3, tension = 0.8, aes(colour = value), show.legend = F) + 
  scale_edge_color_continuous(low = "cornsilk2", high = "red4") + 
  geom_node_text(aes(x = x * 1.15, y = y * 1.15, filter = leaf, label = name, angle = angle, hjust = hjust, colour = group), 
                 size = 3, alpha = 0.6) + 
  geom_node_point(aes(filter = leaf, x = x * 1.07, y = y * 1.07, fill = factor(group), size = occur), 
                  alpha = 0.6, shape = 21, stroke = 0, show.legend = F) + 
  scale_size_continuous(range = c(0.1, 10)) + 
  labs(color = "Modules", title = "Unweighted species modules") + 
  theme_void() + 
  theme(plot.title = element_text(hjust=0.5), 
        plot.margin = unit(c(0, 0, 0, 0), "cm")) + 
  expand_limits(x = c(-1.5, 1.5), y = c(-1.5, 1.5))
bin.sp.edge.group.plot

This plot shows the module structure of phytoplankton species, with different colours indicating community groups. Node size indicates the frequency that each species occurs in the dataset and edges reflect species co-occurrence. Darker edges indicate more instances of co-occurrence; however, patterns may be obscured in cases with large numbers of co-occurring taxa.

We can see that module 6 (in pink) is the most species rich, while module 3 (in green) is the most species poor. We can also observe that module 2 (in gold) has several taxa that occur in a large number of lakes, including Plagioselmis nannoplanctica, Katablepharis ovalis, and species of the genus Chromulinaceae.

We can also visualize the distribution of traits among species modules. For example, we can highlight those taxa that are suspected toxin producers, silica users, or chlorophyll b and c producers.

# Plot identifying suspected toxin producers
bin.sp.edge.toxin.plot <- ggraph(bin.sp.mods.mygraph, layout = 'dendrogram', circular = TRUE) +
  geom_conn_bundle(data = get_con(from = bin.sp.mods.from, to = bin.sp.mods.to, value = bin.sp.mods.connect$value),
                   edge_alpha = 0.1, width = 0.3, tension = 0.8, aes(colour = value)) +
  scale_edge_color_continuous(low = "cornsilk2", high = "red4") +
  geom_node_text(aes(x = x * 1.15, y = y * 1.15, filter = leaf, label = name, angle = angle, 
                     hjust = hjust, colour = SUSPECTED_TOXIN_PRODUCING), size = 3, alpha = 0.6) +
  geom_node_point(aes(filter = leaf, x = x * 1.07, y = y * 1.07, fill = factor(group), size = occur), 
                  alpha = 0.6, shape = 21, stroke = 0) +
  scale_size_continuous(range = c(0.1, 10)) +
  labs(title = "Suspected toxin producers") +
  theme_void() +
  theme(legend.position = "none",
        plot.title = element_text(hjust=0.5),
        plot.margin = unit(c(0, 0, 0, 0), "cm")) +
  expand_limits(x = c(-1.5, 1.5), y = c(-1.5, 1.5))
bin.sp.edge.toxin.plot

# Plot identifying species with silica requirement
bin.sp.edge.silica.plot <- ggraph(bin.sp.mods.mygraph, layout = 'dendrogram', circular = TRUE) +
  geom_conn_bundle(data = get_con(from = bin.sp.mods.from, to = bin.sp.mods.to, value = bin.sp.mods.connect$value),
                   edge_alpha = 0.1, width = 0.3, tension = 0.8, aes(colour = value)) +
  scale_edge_color_continuous(low = "cornsilk2", high = "red4") +
  geom_node_text(aes(x = x * 1.15, y = y * 1.15, filter = leaf, label = name, angle = angle, 
                     hjust = hjust, colour = SILICA_USING), size = 3, alpha = 0.6) +
  geom_node_point(aes(filter = leaf, x = x * 1.07, y = y * 1.07, fill = factor(group), size = occur), 
                  alpha = 0.6, shape = 21, stroke = 0) +
  scale_size_continuous(range = c(0.1, 10)) +
  labs(title = "Silica requirement") +
  theme_void() +
  theme(legend.position = "none",
        plot.title = element_text(hjust=0.5),
        plot.margin = unit(c(0, 0, 0, 0), "cm")) +
  expand_limits(x = c(-1.5, 1.5), y = c(-1.5, 1.5))
bin.sp.edge.silica.plot

# Plot identifying chlorophyll b pigmented species
bin.sp.edge.chlb.plot <- ggraph(bin.sp.mods.mygraph, layout = 'dendrogram', circular = TRUE) +
  geom_conn_bundle(data = get_con(from = bin.sp.mods.from, to = bin.sp.mods.to, value = bin.sp.mods.connect$value),
                   edge_alpha = 0.1, width = 0.3, tension = 0.8, aes(colour = value)) +
  scale_edge_color_continuous(low = "cornsilk2", high = "red4") +
  geom_node_text(aes(x = x * 1.15, y = y * 1.15, filter = leaf, label = name, angle = angle, 
                     hjust = hjust, colour = CHL_B), size = 3, alpha = 0.6) +
  geom_node_point(aes(filter = leaf, x = x * 1.07, y = y * 1.07, fill = factor(group), size = occur), 
                  alpha = 0.6, shape = 21, stroke = 0) +
  scale_size_continuous(range = c(0.1, 10)) +
  labs(title = 'Chlorophyll'~italic(b)~'pigmented') +
  theme_void() +
  theme(legend.position = "none",
        plot.title = element_text(hjust=0.5),
        plot.margin = unit(c(0, 0, 0, 0), "cm")) +
  expand_limits(x = c(-1.5, 1.5), y = c(-1.5, 1.5))
bin.sp.edge.chlb.plot

# Plot identifying chlorophyll c pigmented species
bin.sp.edge.chlc.plot <- ggraph(bin.sp.mods.mygraph, layout = 'dendrogram', circular = TRUE) +
  geom_conn_bundle(data = get_con(from = bin.sp.mods.from, to = bin.sp.mods.to, value = bin.sp.mods.connect$value),
                   edge_alpha = 0.1, width = 0.3, tension = 0.8, aes(colour = value)) +
  scale_edge_color_continuous(low = "cornsilk2", high = "red4") +
  geom_node_text(aes(x = x * 1.15, y = y * 1.15, filter = leaf, label = name, angle = angle, 
                     hjust = hjust, colour = CHL_C), size = 3, alpha = 0.6) +
  geom_node_point(aes(filter = leaf, x = x * 1.07, y = y * 1.07, fill = factor(group), size = occur), 
                  alpha = 0.6, shape = 21, stroke = 0) +
  scale_size_continuous(range = c(0.1, 10)) +
  labs(title = 'Chlorophyll'~italic(c)~'pigmented') +
  theme_void() +
  theme(legend.position = "none",
        plot.title = element_text(hjust=0.5),
        plot.margin = unit(c(0, 0, 0, 0), "cm")) +
  expand_limits(x = c(-1.5, 1.5), y = c(-1.5, 1.5))
bin.sp.edge.chlc.plot

From these plot, we can see how different traits are dispersed across modules. For instance, suspected toxin producers occur in each of the modules except number 3 (green), including the relatively common Aphanocapsa delicatissima and Aphanocapsa elachista in module 1 (red), Aphanocapsa rivularis and Aphanizomenon flosaquae in module 2 (gold), Dolichospermum spiroides in module 4 (turquoise), Aphanothece nidulans in module 5 (blue), and Limnothrix sp. in module 6 (pink). Species with silica (i.e. diatoms) requirement and different photosynthetic pigments were also well dispersed across groups. However, chlorophyll b pigmented algae (i.e. chloro- and euglenophytes) were generally more common in modules 5 (blue) and 6 (pink), while chlorophyll c producing species (i.e. ochrophytes, cryptomonads, and dinoflagellates) were more associated with modules 2 (gold), 3 (green), and 4 (turquoise).

5. Comparing module structure of weighted and unweighted networks

Finally, we can compare the module structure of the unweighted network graph (based on presence/absence) to that of a network weighted by species biomass. First, we’ll compare site modules.

# Set random seed for reproducibility of random process
set.seed(99)

# Conduct bipartite modularity analysis on weighted network using Beckett's algorithm
bio.DIRT <- computeModules(phyto.comm.aug.bio, method = "Beckett")

# Determine number of modules
nrow(bio.DIRT@modules) - 1
## [1] 9
# Calculate participation coefficient
bio.DIRT.cz.higher <- czvalues(bio.DIRT, level = "higher")
bio.DIRT.cz.lower <- czvalues(bio.DIRT, level = "lower")

# Check modularity (Q) value of proposed module structure
bio.DIRT@likelihood
## [1] 0.6277079

The algorithm detected nine modules with a modularity value of 0.6277079.

# Extract the site module structure
bio.DIRT.list <- listModuleInformation(bio.DIRT)

bio.DIRT.site.mod.1 <- bio.DIRT.list[[2]][[1]][[1]]
bio.DIRT.site.mod.1.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.1, 1, 0))
bio.DIRT.site.mod.2 <- bio.DIRT.list[[2]][[2]][[1]]
bio.DIRT.site.mod.2.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.2, 1, 0))
bio.DIRT.site.mod.3 <- bio.DIRT.list[[2]][[3]][[1]]
bio.DIRT.site.mod.3.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.3, 1, 0))
bio.DIRT.site.mod.4 <- bio.DIRT.list[[2]][[4]][[1]]
bio.DIRT.site.mod.4.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.4, 1, 0))
bio.DIRT.site.mod.5 <- bio.DIRT.list[[2]][[5]][[1]]
bio.DIRT.site.mod.5.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.5, 1, 0))
bio.DIRT.site.mod.6 <- bio.DIRT.list[[2]][[6]][[1]]
bio.DIRT.site.mod.6.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.6, 1, 0))
bio.DIRT.site.mod.7 <- bio.DIRT.list[[2]][[7]][[1]]
bio.DIRT.site.mod.7.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.7, 1, 0))
bio.DIRT.site.mod.8 <- bio.DIRT.list[[2]][[8]][[1]]
bio.DIRT.site.mod.8.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.8, 1, 0))
bio.DIRT.site.mod.9 <- bio.DIRT.list[[2]][[9]][[1]]
bio.DIRT.site.mod.9.vector <- as.matrix(if_else(rownames(bio.DIRT@moduleWeb) %in% bio.DIRT.site.mod.9, 1, 0))

# Wrangle site module structure
bio.DIRT.site.mods <- cbind(bio.DIRT.site.mod.1.vector, bio.DIRT.site.mod.2.vector, bio.DIRT.site.mod.3.vector, bio.DIRT.site.mod.4.vector, bio.DIRT.site.mod.5.vector, bio.DIRT.site.mod.6.vector, bio.DIRT.site.mod.7.vector, bio.DIRT.site.mod.8.vector, bio.DIRT.site.mod.9.vector)

rownames(bio.DIRT.site.mods) <- rownames(bio.DIRT@moduleWeb)
bio.DIRT.site.mods <- merge(bio.DIRT.site.mods, bio.DIRT.cz.lower$c, by = "row.names")
rownames(bio.DIRT.site.mods) <- bio.DIRT.site.mods$Row.names
bio.DIRT.site.mods <- bio.DIRT.site.mods[, -1]

colnames(bio.DIRT.site.mods) <- c("Mod01", "Mod02", "Mod03", "Mod04", "Mod05", "Mod06", "Mod07", "Mod08", "Mod09", "c")
bio.DIRT.site.mods <- merge(bio.DIRT.site.mods, coords2var, by = "row.names")
rownames(bio.DIRT.site.mods) <- bio.DIRT.site.mods$Row.names
bio.DIRT.site.mods <- bio.DIRT.site.mods[, -1]

bio.DIRT.site.mods['Mods'] <- NA
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod01 == 1] <- "Mod01"
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod02 == 1] <- "Mod02"
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod03 == 1] <- "Mod03"
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod04 == 1] <- "Mod04"
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod05 == 1] <- "Mod05"
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod06 == 1] <- "Mod06"
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod07 == 1] <- "Mod07"
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod08 == 1] <- "Mod08"
bio.DIRT.site.mods$Mods[bio.DIRT.site.mods$Mod09 == 1] <- "Mod09"
bio.DIRT.site.mods$Mods <- as.factor(bio.DIRT.site.mods$Mods)

# Plot weighted module map
bio.site.mod.map <- ggmap(basemap, maprange = TRUE) +
  geom_point(data = bio.DIRT.site.mods, aes(-Longitude, Latitude, color = Mods), shape = 16, size = 3) +
  geom_point(data = bio.DIRT.site.mods, aes(-Longitude, Latitude), color = "black", shape = 1, size = 3) +
  scale_x_continuous("Longitude", breaks = c(-120, -115, -110), 
                     labels = c("-120", "-115", "-110"), limits = c(-120, -110)) +
  scale_y_continuous("Latitude", breaks = c(49, 54, 59), 
                     labels = c("49", "54", "59"), limits = c(49, 59.7)) +
  labs(color = "Modules", title = "Weighted site modules") +
  scalebar(x.min = attr(basemap, "bb")[[2]],
           y.min = attr(basemap, "bb")[[1]],
           x.max = attr(basemap, "bb")[[4]],
           y.max = attr(basemap, "bb")[[3]],
           dist = 100, anchor = c(x = -119.6, y = 49.6), transform = T, 
           location = "bottomleft", st.size = 3, st.dist = 0.032, dist_unit = "km") +
  theme(panel.border = element_rect(colour = "black", fill = NA),
        plot.title = element_text(hjust = 0.5),
        legend.key = element_rect(fill = "white"))
# Arrange site module maps
north2(compare.site.mod.map <- grid.arrange(bin.site.mod.map, bio.site.mod.map, nrow = 1), x = 0.95, y = 0.9, symbol = 16)

Comparing these maps, we can see that the module structure of communities is very different depending on whether or not the species are weighted by their biomass contributions. Not only did the weighted analysis detect a larger number of modules (nine), but the spatial arrangement of modules also differs. However, as with the unweighted network, the module structure based on biomass also lacks any clear spatial pattern, suggesting an important role for more local processes in differentiating phytoplankton communities across Alberta, Canada.

We can also look at how species modules differ between the unweighted and weighted networks.

# Extract the species module structure
bio.DIRT.sp.mod.1 <- bio.DIRT.list[[2]][[1]][[2]]
bio.DIRT.sp.mod.1.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.1, 1, 0))
bio.DIRT.sp.mod.2 <- bio.DIRT.list[[2]][[2]][[2]]
bio.DIRT.sp.mod.2.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.2, 1, 0))
bio.DIRT.sp.mod.3 <- bio.DIRT.list[[2]][[3]][[2]]
bio.DIRT.sp.mod.3.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.3, 1, 0))
bio.DIRT.sp.mod.4 <- bio.DIRT.list[[2]][[4]][[2]]
bio.DIRT.sp.mod.4.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.4, 1, 0))
bio.DIRT.sp.mod.5 <- bio.DIRT.list[[2]][[5]][[2]]
bio.DIRT.sp.mod.5.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.5, 1, 0))
bio.DIRT.sp.mod.6 <- bio.DIRT.list[[2]][[6]][[2]]
bio.DIRT.sp.mod.6.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.6, 1, 0))
bio.DIRT.sp.mod.7 <- bio.DIRT.list[[2]][[7]][[2]]
bio.DIRT.sp.mod.7.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.7, 1, 0))
bio.DIRT.sp.mod.8 <- bio.DIRT.list[[2]][[8]][[2]]
bio.DIRT.sp.mod.8.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.8, 1, 0))
bio.DIRT.sp.mod.9 <- bio.DIRT.list[[2]][[9]][[2]]
bio.DIRT.sp.mod.9.vector <- as.matrix(if_else(colnames(bio.DIRT@moduleWeb) %in% bio.DIRT.sp.mod.9, 1, 0))

# Wrangle species module structure
bio.DIRT.sp.mods <- cbind(bio.DIRT.sp.mod.1.vector, bio.DIRT.sp.mod.2.vector, bio.DIRT.sp.mod.3.vector, bio.DIRT.sp.mod.4.vector, bio.DIRT.sp.mod.5.vector, bio.DIRT.sp.mod.6.vector, bio.DIRT.sp.mod.7.vector, bio.DIRT.sp.mod.8.vector, bio.DIRT.sp.mod.9.vector)

rownames(bio.DIRT.sp.mods) <- colnames(bio.DIRT@moduleWeb)
bio.DIRT.sp.mods <- merge(bio.DIRT.sp.mods, bio.DIRT.cz.higher$c, by = "row.names")
rownames(bio.DIRT.sp.mods) <- bio.DIRT.sp.mods$Row.names
bio.DIRT.sp.mods <- bio.DIRT.sp.mods[, -1]

colnames(bio.DIRT.sp.mods) <- c("Mod01", "Mod02", "Mod03", "Mod04", "Mod05", "Mod06", "Mod07", "Mod08", "Mod09", "c")

bio.DIRT.sp.mods['Mods'] <- NA
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod01 == 1] <- "Mod01"
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod02 == 1] <- "Mod02"
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod03 == 1] <- "Mod03"
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod04 == 1] <- "Mod04"
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod05 == 1] <- "Mod05"
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod06 == 1] <- "Mod06"
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod07 == 1] <- "Mod07"
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod08 == 1] <- "Mod08"
bio.DIRT.sp.mods$Mods[bio.DIRT.sp.mods$Mod09 == 1] <- "Mod09"
bio.DIRT.sp.mods$Mods <- as.factor(bio.DIRT.sp.mods$Mods)

#  Create adjacency matrix and wrangle edge lists
bio.sp.adjac <- as.one.mode(phyto.comm.aug.bio, fill = 0, project = "higher", weighted = TRUE)

bio.sp.origins <- data.frame(c("origin", "origin", "origin", "origin",  "origin", "origin",  "origin",  "origin",  "origin"), c("Mod01", "Mod02", "Mod03", "Mod04", "Mod05", "Mod06", "Mod07", "Mod08", "Mod09"))
names(bio.sp.origins) <- c("from", "to")
bio.sp.mods.edges <- as.data.frame(bio.DIRT.sp.mods$Mods)
colnames(bio.sp.mods.edges) <- c("from")
bio.sp.mods.edges$to <- rownames(bio.DIRT.sp.mods)
bio.sp.mods.edges <- bio.sp.mods.edges[order(bio.sp.mods.edges$from), ]
bio.sp.mods.edges <- rbind(bio.sp.origins, bio.sp.mods.edges)

bio.sp.temp.graph <- graph_from_adjacency_matrix(bio.sp.adjac, mode = "undirected", weighted = TRUE, diag = FALSE)
bio.sp.temp.graph.edge.att <- E(bio.sp.temp.graph)$weight
bio.sp.temp.graph.edge.list <- get.edgelist(bio.sp.temp.graph)
bio.sp.mods.connect <- as.data.frame(cbind(bio.sp.temp.graph.edge.list, bio.sp.temp.graph.edge.att))
colnames(bio.sp.mods.connect) <- c("from", "to", "value")
bio.sp.mods.connect$value <- as.numeric(bio.sp.mods.connect$value)

bio.sp.mods.colsums <- as.data.frame(colSums (phyto.comm.aug.bio))
bio.sp.mods.colsums.match <- as.data.frame(bio.sp.mods.colsums$`colSums(phyto.comm.aug.bio)`[match(bio.sp.mods.edges$to, rownames(bio.sp.mods.colsums))])
bio.sp.mods.colsums.match <- rbind(c(NA), bio.sp.mods.colsums.match)
colnames(bio.sp.mods.colsums.match) <- "occur"

# Create data.frame of vertices
bio.sp.mods.vertices <- data.frame(name = unique(c(as.character(bio.sp.mods.edges$from), as.character(bio.sp.mods.edges$to))), value = bio.sp.mods.colsums.match)
bio.sp.mods.vertices$group <- bio.sp.mods.edges$from[match(bio.sp.mods.vertices$name, bio.sp.mods.edges$to)]

# Calculate angle of labels
bio.sp.mods.vertices$id <- NA
bio.sp.mods.myleaves <- which(is.na(match(bio.sp.mods.vertices$name, bio.sp.mods.edges$from)))
bio.sp.mods.nleaves <- length(bio.sp.mods.myleaves)
bio.sp.mods.vertices$id[bio.sp.mods.myleaves]<-seq(1:bio.sp.mods.nleaves)
bio.sp.mods.vertices$angle <- 90 - 360 * (bio.sp.mods.vertices$id - 22) / bio.sp.mods.nleaves

# Calculate the alignment of labels
bio.sp.mods.vertices$hjust <- ifelse((bio.sp.mods.vertices$angle < 90.01 & bio.sp.mods.vertices$angle > -90), 1, 0)

# Flip label angles
bio.sp.mods.vertices$angle <- ifelse(bio.sp.mods.vertices$angle < -90 | bio.sp.mods.vertices$angle > 90, bio.sp.mods.vertices$angle + 180, bio.sp.mods.vertices$angle)

# Create a graph object
bio.sp.mods.mygraph <- graph_from_data_frame(bio.sp.mods.edges, vertices = bio.sp.mods.vertices)

# Create connection objects
bio.sp.mods.from <- match(bio.sp.mods.connect$from, bio.sp.mods.vertices$name)
bio.sp.mods.to <- match(bio.sp.mods.connect$to, bio.sp.mods.vertices$name)

# Plot weighted hierarchical edge bundling plot
bio.sp.edge.group.plot <- ggraph(bio.sp.mods.mygraph, layout = 'dendrogram', circular = TRUE) +
  geom_conn_bundle(data = get_con(from = bio.sp.mods.from, to = bio.sp.mods.to, value = bio.sp.mods.connect$value),
                   edge_alpha = 0.1, width = 0.3, tension = 0.8, aes(colour = value), show.legend = F) +
  scale_edge_color_continuous(low = "cornsilk2", high = "red4") +
  geom_node_text(aes(x = x * 1.15, y = y * 1.15, filter = leaf, label = name, angle = angle, 
                     hjust = hjust, colour = group), size = 3, alpha = 0.6) +
  geom_node_point(aes(filter = leaf, x = x * 1.07, y = y * 1.07, fill = factor(group), size = occur), 
                  alpha = 0.6, shape = 21, stroke = 0, show.legend = F) +
  scale_size_continuous(range = c(0.1, 10)) +
  labs(color = "Modules", title = "Weighted species modules") +
  theme_void() +
  theme(plot.title = element_text(hjust = 0.5),
        plot.margin = unit(c(0, 0, 0, 0), "cm")) +
  expand_limits(x = c(-1.5, 1.5), y = c(-1.5, 1.5))
# Arrange species module plots
compare.sp.edge.group.plot <- grid.arrange(bio.sp.edge.group.plot, bin.sp.edge.group.plot, nrow = 2)

Again, we see very different module structures from the weighted and unweighted networks. Note that node sizes indicates relative biomass in the weighted network, which is concentrated in a few key species, including toxin producing Aphanizomenon gracile in module 2 (orange), Aphanizomenon flosaquae in module 4 (green), Dolichospermum spiroides in module 8 (purple), and Planktothrix agardhii in module 9 (pink).

As with the site module structure, species results lend themselves to further analysis to understand associations among species within and between groups, including their functional trait and phylogenetic similarities.

6. References

Barber, M.J. (2007). Modularity and community detection in bipartite networks. Physical Review E, 76, 066102.

Beckett, S.J. (2016). Improved community detection in weighted bipartite networks. Royal Society Open Science, 3, 140536.

Dormann, C.F., Gruber, B., & Fründ, J. (2008). Introducing the bipartite package: analysing ecological networks. R News, 8, 8–11.

Guimerà, R, & Amaral, L.A.N. (2005). Functional cartography of complex metabolic networks. Nature, 433, 895–900.

Loewen, C.J.G., Wyatt, F.R., Mortimer, C.A., Vinebrooke, R.D., & Zurawell, R.W. (2020). Multiscale drivers of phytoplankton communities in north temperate lakes. Ecological Applications, 30, e02102.

Loewen C.J.G., Jackson D.A., Chu C., Alofs K.M., Hansen G.J.A., Honsey A.E., Minns C.K., & Wehrly K.E. (2021a). Bioregions are predominantly climatic for fishes of northern lakes. Global Ecology and Biogeography.

Loewen. C.J.G., Vinebrooke, R.D., & Zurawell R.W. (2021b). Quantifying seasonal succession of phytoplankton trait-environment associations in human-altered landscapes. Limnology and Oceanography, 66, 1409-1423.

Miyauchi, A., & Sukegawa, N. (2015). Maximizing Barber’s bipartite modularity is also hard. Optimization Letters, 9, 897–913.

Newman, M.E.J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69, 026113.

Oksanen, J., Blanchet, F.G., Friendly, M., Kindt, R., Legendre, P., McGlinn, D., … Wagner, H. (2020). vegan: community ecology package. R package version 2.5-7.

Raghavan, U.N., Albert, R., & Kumara, S. (2007). Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76, 036106.

Strona, G., Nappo, D., Boccacci, F., Fattorini, S., & San-Miguel-Ayanz, J. (2014). A fast and unbiased procedure to randomize ecological binary matrices with fixed row and column totals. Nature Communications, 5, 4114.